Abstract

This article presents a new semantic- 
based transfer approach developed and 
applied within the Verbmobil Machine
Translation project. We give an overview 
of the declarative transfer formalism to- 
gether with its procedural realization. 
Our approach is discussed and compared 
with several other approaches from the 
MT literature. The results presented in 
this article have been implemented and 
integrated into the Verbmobil system.

1 Introduction 

The work presented in this article was developed 
within the Verbmobil project (Kay et al., 1994;
Wahlster, 1993). This is one of the largest projects 
dealing with Machine Translation (MT) of spo- 
ken language. Approximately 100 researchers in 
29 public and industrial institutions are involved. 
The application domain is spontaneous spoken 
language in face-to-face dialogs. The current sce-
nario is restricted to the task of appointment
scheduling and the languages involved are English, 
German and Japanese. 

This article describes the realization of a trans- 
fer approach based on the proposals of (Abb and 
Buschbeck-Wolf, 1995; Caspari and Schmid, 1994) 
and (Copestake, 1995). Transfer-based MT, see
e.g. (Vauquois and Boitet, 1985; Nagao et al., 
1985), is based on contrastive bilingual corpus 
analyses from which a bilingual lexicon of trans- 
fer equivalences is derived. In contrast to a purely 
lexicalist approach which relates bags of lexical
signs, as in Shake-and-Bake MT (Beaven, 1992; 
Whitelock, 1992), our transfer approach operates 
on the level of semantic representations produced 
by various analysis steps. The output of transfer is 
a semantic representation for the target language 
which is input to the generator and speech synthe- 
sis to produce the target language utterance. Our 
transfer equivalences abstract away from morpho-
logical and syntactic idiosyncrasies of source and
target languages. The bilingual equivalences are 
described on the basis of semantic representations. 

Since the Verbmobil domain is related to dis-
course rather than isolated sentences, the model
theoretic semantics is based on Kamp's Discourse 
Representation Theory, DRT (Kamp and Reyle, 
1993). In order to allow for underspecification, 
variants of Underspecified Discourse Representa- 
tion Structures (UDRS) (Reyle, 1993) are em- 
ployed as semantic formalisms in the different 
analysis components (Bos et al., 1996; Egg and 
Lebeth, 1995; Copestake et al., 1995).

Together with other kinds of information, such 
as tense, aspect, prosody and morpho-syntax, 
the different semantic representations are mapped 
into a single multi-dimensional representation 
called Verbmobil Interface Term (VIT) (Dorna,
1996). This single information structure serves as 
input to semantic evaluation and transfer. The 
transfer output is also a VIT which is based 
on the semantics of the English grammar (cf. 
Copestake et al. (1995)) and used for generation 
(see Kilger and Finkler (1995) for a description of 
the generation component). 

Section 2 of this paper sketches the seman-
tic representations we have used for transfer. In
section 3 we introduce transfer rules and dis-
cuss examples. In section 4 we compare our
approach with other MT approaches. In sec-
tion 5 we present a summary of the implemen-
tation aspects. For a more detailed discussion of
the implementation of the transfer formalism see
Dorna and Emele (1996). Finally, section 6 sum-
marizes the results.



2 Semantic Representations 

The different Verbmobil semantic construction
components use variants of UDRS as their seman- 
tic formalisms, cf. (Bos et al., 1996; Egg and Le- 
beth, 1995; Copestake et al., 1995). The ability 
to underspecify quantifier and operator scope to- 
gether with certain lexical ambiguities is impor- 
tant for a practical machine translation system 
like Verbmobil because it supports ambiguity pre-
serving translations. The disambiguation of dif-
ferent readings could require an arbitrary amount
of reasoning on real-world knowledge and thus 
should be avoided whenever possible. 

In the following examples we assume an ex- 
plicit event-based semantics (Dowty, 1989; Par- 
sons, 1991) with a Neo-Davidsonian representa- 
tion of semantic argument relations. All seman- 
tic entities in UDRS are uniquely labeled. A la- 
bel is a pointer to a semantic predicate making it 
easy to refer to. The labeling of all semantic enti- 
ties allows a flat representation of the hierarchical 
structure of argument and operator and quantifier 
scope embeddings as a set of labeled conditions. 
The recursive embedding is expressed via addi- 
tional subordination constraints on labels which 
occur as arguments of such operators. 

Example (1a) shows one of the classical
Verbmobil examples and its possible English
translation (1b).

(1) a. Das paßt echt schlecht bei mir.
b. That really doesn't suit me well. 

The corresponding semantic representations are 
given in (2a) and (2b), respectively.²

(2) a. [l1:echt(l2), l2:schlecht(i1),
        l3:passen(i1), l3:arg3(i1,i2),
        l4:pron(i2), l5:bei(i1,i3), l6:ich(i3)]
    b. [l1:real(l2), l2:neg(l7), l7:good(i1),
        l3:suit(i1), l3:arg3(i1,i2),
        l4:pron(i2), l5:arg2(i1,i3), l6:ego(i3)]

Semantic entities in (2) are represented as a Pro- 
log list of labeled conditions. After the unification- 
based semantic construction, the logical variables 
for labels and markers, such as events, states and 
individuals, are skolemized with special constant 
symbols, e.g. l1 for a label and i1 for a state. Ev-
ery condition is prefixed with a label serving as a
unique identifier. Labels are also useful for group-
ing sets of conditions, e.g. for partitions which be-
long to the restriction of a quantifier or which
are part of a specific sub-DRS. Additionally, all 
these special constants can be seen as pointers for 
adding or linking information within and between 
multiple levels of the VIT. 
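
The flat, labeled representation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding `(label, predicate, arguments)` and the helper names `group_by_label` and `scope_of` are ours, not part of the VIT format. The data mirror the simplified list in (2a).

```python
from collections import defaultdict

# Each condition is (label, predicate, arguments); constants follow (2a).
# The tuple encoding is an assumption made for this sketch.
VIT_2A = [
    ("l1", "echt", ("l2",)),
    ("l2", "schlecht", ("i1",)),
    ("l3", "passen", ("i1",)),
    ("l3", "arg3", ("i1", "i2")),
    ("l4", "pron", ("i2",)),
    ("l5", "bei", ("i1", "i3")),
    ("l6", "ich", ("i3",)),
]


def group_by_label(conditions):
    """Group conditions that share a label (e.g. a verb and its roles)."""
    groups = defaultdict(list)
    for label, pred, args in conditions:
        groups[label].append((pred, args))
    return dict(groups)


def scope_of(conditions, operator_label):
    """Return the conditions whose label is an argument of the operator.

    Arguments that are markers (i1, i2, ...) rather than labels simply
    contribute nothing, because no condition is labeled with them.
    """
    by_label = group_by_label(conditions)
    result = []
    for _pred, args in by_label.get(operator_label, []):
        for arg in args:
            result.extend(by_label.get(arg, []))
    return result


groups = group_by_label(VIT_2A)
```

Here `groups["l3"]` collects passen together with its arg3 role, and `scope_of(VIT_2A, "l1")` returns the schlecht condition that the intensifier echt takes in its scope, without any nested structure being built.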

Only the set of semantic conditions is shown in
(2); the other levels of the multi-dimensional VIT
representation, which contain additional semantic,
pragmatic, morpho-syntactic and prosodic infor-
mation, have been left out here. If necessary, such
additional information can be used in transfer and
semantic evaluation for resolving ambiguities or in
generation for guiding the realization choices. Fur-
thermore, it allows transfer to make fine-grained
distinctions between alternatives in cases where
the semantic representations of source and target
language do not match up exactly.

2 For presentation purposes we have simplified the
actual VIT representations.

Semantic operators like negation, modals or in- 
tensifier adverbials, such as really, take extra label 
arguments for referring to other elements in the 
flat list which are in the relative scope of these
operators.³

This form of semantic representation has the 
following advantages for transfer: 

• It is possible to preserve the underspecifica- 
tion of quantifier and operator scope if there 
is no divergence regarding scope ambiguity 
between source and target languages. 

• Coindexation of labels and markers in the 
source and target parts of transfer rules en- 
sures that the semantic entities are correctly 
related and hence obey any semantic con- 
straints which may be linked to them. 

• To produce an adequate target utterance 
additional constraints which are important 
for generation, e.g. sortal, topic/focus con- 
straints etc., may be preserved. 

• There need not be a 1 : 1 relation between 
semantic entities and individual lexical items. 
Instead, lexical units may be decomposed into 
a set of semantic entities, e.g. in the case of 
derivations and for a more fine grained lexical 
semantics. Lexical decomposition allows us to 
express generalizations and to apply transfer 
rules to parts of the decomposition. 

3 Our Transfer Approach 

Transfer equivalences are stated as relations be- 
tween sets of source language (SL) and sets of tar- 
get language (TL) semantic entities. They are usu- 
ally based on individual lexical items but might 
also involve partial phrases for treating idioms and 
other collocations, e.g. verb-noun collocations (see 
example (8) below). After skolemization of the se-
mantic representation the input to transfer is vari-
able free. This allows the use of logical variables
for labels and markers in transfer rules to express
coindexation constraints between individual enti-
ties such as predicates, operators, quantifiers and
(abstract) thematic roles. Hence the skolemiza-
tion prevents unwanted unification of labels and
markers while matching individual transfer rules
against the semantic representation.

3 For the concrete example at hand, the relative
scope has been fully resolved by using the explicit la-
bels of other conditions. If the scope were underspeci-
fied, explicit subordination constraints would be used
in a special scope slot of the VIT. The exact details
of subordination are beyond the scope of this paper,
cf. Frank and Reyle (1995) and Bos et al. (1996) for
implementations.

The general form of a transfer rule is given by 

SLSem, SLConds TauOp TLSem, TLConds.

where SLSem and TLSem are sets of semantic enti- 
ties. TauOp is an operator indicating the intended 
application direction (one of <->,->,<-). SLConds 
and TLConds are optional sets of SL and TL con- 
ditions, respectively. All sets are written as Prolog 
lists and optional conditions can be omitted. 

On the source language, the main difference be- 
tween the SLSem and conditions is that the for- 
mer is matched against the input and replaced by 
the TLSem, whereas conditions act as filters on the 
applicability of individual transfer rules without 
modifying the input representation. Hence condi- 
tions may be viewed as general inferences which 
yield either true or false depending on the context. 
The context might either be the local context as 
defined by the current VIT or the global context 
defined via the domain and dialog model. Such
conditions might involve arbitrarily complex infer-
ences like anaphora resolution or the determina-
tion of the current dialog act. In an interactive
system one could even imagine that conditions are 
posed as yes/no-questions to the user to act as a 
negotiator (Kay et al., 1994) for choosing the most 
plausible translation. 

If the translation rules in (3) are applied to the 
semantic input in (2a) they yield the semantic out- 
put in (2b). We restrict the following discussion 
to the direction from German to English but the 
rules can be applied in the other direction as well. 
(3) a. [L:echt(A)] <-> [L:real(A)].
    b. [L:passen(E), L:arg3(E,Y), L1:bei(E,X)] <->
       [L:suit(E), L:arg2(E,X), L:arg3(E,Y)].
    c. [L:schlecht(E)], [L1:passen(E)] <->
       [L:neg(A), A:good(E)].
    d. [L:ich(X)] <-> [L:ego(X)].
    e. [L:pron(X)] <-> [L:pron(X)].

The simple lexical transfer rule in (3a) relates the
German intensifier echt with the English real.⁴
The variables L and A ensure that the label and 
the argument of the German echt are assigned to 
the English predicate real, respectively. 

The equivalence in (3b) relates the German 
predicate passen with the English predicate suit. 
The rule not only identifies the event marker E, 
but unifies the instances X and Y of the relevant 
thematic roles. Despite the fact that the German 
bei-phrase is analysed as an adjunct, it is treated
exactly like the argument arg3 which is syntacti- 
cally subcategorized. This rule shows how struc- 
tural divergences can easily be handled within this 
approach. 

4 The semantic predicate real abstracts away from 
the adjective/adverbial distinction. 



(4) [L:passen(E), L1:bei(E,X)] <->
    [L:suit(E), L:arg2(E,X)].

The rule in (3b) might be further abbreviated to
(4) by leaving out the unmodified arg3, because it
is handled by a single metarule, which passes on 
all semantic entities that are preserved between 
source and target representation. This also makes 
the rule for (3e) superfluous, since it uses an inter- 
lingua predicate for the anaphor in German and 
English. 

The rule in (3c) illustrates how an additional
condition ([L1:passen(E)]) might be used to
trigger a specific translation of schlecht into not
good in the context of passen. The standard trans-
lation of schlecht to bad is blocked for verbs
like suit that presuppose a positive attitude
adverbial.⁵ One main advantage of having such
conditions is the preservation of the modularity 
of transfer equivalences because we do not have to 
specify the translation of the particular verb which 
only triggers the specific translation of the adver- 
bial. Consequently, the transfer units remain small 
and independent of other elements, thus the in- 
terdependences between different rules are vastly 
reduced. The handling of such rule interactions is 
known to be one of the major problems in scaling 
up MT systems. 
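
How rules, conditions and the metarule interact on example (2a) can be sketched as a small Python matcher. It is a toy under stated assumptions, not the compiled Prolog program of the actual system: the tuple encoding, the function names, the one-application-per-rule loop and the way a fresh label is generated for the unbound variable A in (3c) are all ours. Rules are triples of SL patterns, SL conditions and TL patterns, and names starting with an uppercase letter are rule variables.

```python
import itertools


def is_var(term):
    # Prolog convention: variables start with an uppercase letter.
    return term[:1].isupper()


def match_cond(pat, cond, env):
    """Match one pattern condition against one ground condition."""
    (plabel, ppred, pargs), (label, pred, args) = pat, cond
    if ppred != pred or len(pargs) != len(args):
        return None
    env = dict(env)
    for p, v in zip((plabel,) + pargs, (label,) + args):
        if is_var(p):
            if env.setdefault(p, v) != v:
                return None
        elif p != v:
            return None
    return env


def match_all(pats, conds, env, used=()):
    """Yield (bindings, matched indices) for each way to match all patterns."""
    if not pats:
        yield env, used
        return
    for i, cond in enumerate(conds):
        if i not in used:
            new_env = match_cond(pats[0], cond, env)
            if new_env is not None:
                yield from match_all(pats[1:], conds, new_env, used + (i,))


def transfer(rules, source):
    # Assumption: labels look like l<number>; fresh ones continue the count.
    fresh = itertools.count(1 + max(int(l[1:]) for l, _, _ in source))
    remaining, target = list(source), []
    for sl, sl_conds, tl in rules:
        for env, used in match_all(sl, remaining, {}):
            # Conditions only filter: they are tested against the whole
            # input and never consume anything.
            env = next((e for e, _ in match_all(sl_conds, source, env)), None)
            if env is None:
                continue
            remaining = [c for i, c in enumerate(remaining) if i not in used]
            for label, pred, args in tl:
                for t in (label,) + args:
                    if is_var(t) and t not in env:
                        env[t] = f"l{next(fresh)}"  # label for a new TL entity
                target.append((env.get(label, label), pred,
                               tuple(env.get(a, a) for a in args)))
            break
    return target + remaining  # metarule: pass on what no rule consumed


VIT_2A = [("l1", "echt", ("l2",)), ("l2", "schlecht", ("i1",)),
          ("l3", "passen", ("i1",)), ("l3", "arg3", ("i1", "i2")),
          ("l4", "pron", ("i2",)), ("l5", "bei", ("i1", "i3")),
          ("l6", "ich", ("i3",))]
RULES = [
    ([("L", "echt", ("A",))], [], [("L", "real", ("A",))]),            # (3a)
    ([("L", "passen", ("E",)), ("L1", "bei", ("E", "X"))], [],
     [("L", "suit", ("E",)), ("L", "arg2", ("E", "X"))]),              # (4)
    ([("L", "schlecht", ("E",))], [("L1", "passen", ("E",))],
     [("L", "neg", ("A",)), ("A", "good", ("E",))]),                   # (3c)
    ([("L", "ich", ("X",))], [], [("L", "ego", ("X",))]),              # (3d)
]
result = transfer(RULES, VIT_2A)
```

The result contains the same conditions as (2b), except that arg2 carries the label l3, because rule (4) as printed attaches it to L. The arg3 and pron conditions are never mentioned by any rule and reach the output only through the final pass-through step, and the condition of (3c) still succeeds after passen has been consumed by (4), since it is checked against the unmodified input.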

A variation on example (1) is given in (5). 

(5) a. Das paßt mir echt schlecht.

b. That really doesn't suit me well. 

The translation is exactly the same, but the Ger-
man verb passen takes an indirect object mir in-
stead of the adjunct bei-phrase in (1). The appro-
priate transfer rule looks like (6a) which can be
reduced to (6b) because no argument switching 
takes place and we can use the metarule again. 

(6) a. [L:passen(E), L:arg2(E,X), L:arg3(E,Y)] <->
       [L:suit(E), L:arg2(E,X), L:arg3(E,Y)].
    b. [L:passen(E)] <-> [L:suit(E)].

In a purely monotonic system without overriding 
it would be possible to apply the transfer rule in 
(6b) to sentence (1) in addition to the rule in (4) 
leading to a wrong translation. In the rule
application scheme assumed here, however,
the more general rule in (6b) will be blocked by
the more specific rule in (4).

The specificity ordering of transfer rules is 
primarily defined in terms of the cardinality of 
matching subsets and by the subsumption order 
on terms. In addition, it also depends on the 
cardinality and complexity of conditions. For the 
passen example at hand, the number of match- 
ing predicates in the two competing transfer rules 
defines the degree of specificity. 
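
The blocking of (6b) by (4) can be illustrated with a deliberately reduced Python sketch. It assumes that a rule is described only by the predicate names on its source side and by its list of conditions; the subsumption order on terms mentioned above is left out. The dictionary layout, the function names and the predicate lists for sentences (1) and (5) are assumptions made for this illustration.

```python
def specificity(rule):
    """Sort key: more matched SL predicates first, then more conditions."""
    return (len(rule["sl"]), len(rule["conds"]))


def select_rule(rules, input_preds):
    """Return the most specific rule whose SL predicates all occur in the input."""
    applicable = [r for r in rules if set(r["sl"]) <= set(input_preds)]
    return max(applicable, key=specificity, default=None)


RULES = [
    {"name": "6b", "sl": ["passen"], "conds": [], "tl": ["suit"]},
    {"name": "4", "sl": ["passen", "bei"], "conds": [], "tl": ["suit", "arg2"]},
]

# Predicate names only; sentence (1) has the bei-adjunct, sentence (5) has arg2.
SENTENCE_1 = ["echt", "schlecht", "passen", "arg3", "pron", "bei", "ich"]
SENTENCE_5 = ["echt", "schlecht", "passen", "arg2", "arg3", "pron", "ich"]

chosen_1 = select_rule(RULES, SENTENCE_1)["name"]
chosen_5 = select_rule(RULES, SENTENCE_5)["name"]
```

For sentence (1) both rules are applicable and the two-predicate rule (4) wins; for sentence (5) only (6b) is applicable, so it serves as the default.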

5 Instead of using a specific lexical item like passen 
the rule should be abstracted for a whole class of verbs 
with similar properties by using a type definition, e.g. 
type(de, pos_attitude_verbs, [gehen, passen, ...]).
For a description of type definitions see (11) below. 



The following example illustrates how condi- 
tions are used to enforce selectional restrictions 
from the domain model. For example, Termin in
German might either be translated as appointment 
or as date, depending on the context. 

(7) a. [L:termin(X)] <-> [L:appointment(X)].
    b. [L:termin(X)], [sort(X)=<~temp_point] <-> [L:date(X)].

The second rule (7b) is more specific, because it 
uses an additional condition. This rule will be 
tried first by calling the external domain model 
for testing whether the sort assigned to X is not 
subsumed by the sort temp_point. Here, the first 
rule (7a) serves as a kind of default with respect to 
the translation of Termin, in cases where no spe-
cific sort information on the marker X is available
or the condition in rule (7b) fails. 

In (8), a light verb construction like einen Ter-
minvorschlag machen is translated into suggest a
date by decomposing the compound and light verb 
to a simplex verb and its modifying noun. 

(8) [L:machen(E), L:arg3(E,X),
     L1:terminvorschlag(X)] <->
    [L:suggest(E), L:arg3(E,X), L1:date(X)].

We close this section with a support verb example
(9) showing the treatment of head switching in our
approach. The German comparative construction
lieber sein (lit.: be more liked) in (9a) is translated
by the verb prefer in (9b).

(9) a. Dienstag ist mir lieber. 
b. I would prefer Tuesday. 

(10) [L:support(S,L1), L2:experiencer(S,X),
      L1:lieb(Y), L1:comparative(Y)] <->
     [L:prefer(S), L:arg1(S,X), L:arg3(S,Y)].

The transfer rule in (10) matches the decomposi- 
tion of the comparative form lieber into its posi- 
tive form lieb and an additional comparative pred- 
icate together with the support verb sein such that 
the comparative construction lieber sein (Y ist X 
lieber) is translated as a whole to the English verb 
prefer (X prefers Y). 

4 Discussion 

The main motivation for using a semantic-based 
approach for transfer is the ability to abstract 
away from morphological and syntactic idiosyn- 
crasies of individual languages. Many of the tra- 
ditional cases of divergences discussed, e.g. by 
Dorr (1994), are already handled in the Verbmobil
syntax-semantics interface, hence they do not 
show up in our transfer approach. Examples in- 
clude cases of categorial and thematic divergences. 
These are treated in the linking between syntac- 
tic arguments and their corresponding thematic 
roles. 

Another advantage of a semantic-based trans- 
fer approach over a pure interlingua approach, 
e.g. Dorr (1993), or a direct structural correspon- 
dence approach, e.g. Slocum et al. (1987), is the
gain in modularity by allowing language indepen-
dent grammar development. Translation equiva- 
lences relating semantic entities of the source and 
target grammars can be formulated in a grammar 
independent bilingual semantic lexicon. In cases 
where the semantic representations of source and 
target language are not isomorphic, a nontrivial 
transfer relation between the two representations 
is needed. But it is clearly much easier to map be- 
tween flat semantic representations than between 
either syntactic trees or deeply nested semantic 
representations.

An interlingua approach presumes that a sin- 
gle representation for arbitrary languages exists 
or can be developed. We believe from a grammar 
engineering point of view it is unrealistic to come 
up with such an interlingua representation with- 
out a strict coordination between the monolingual 
grammars. In general, a pure interlingua approach 
results in very application and domain specific 
knowledge sources which are difficult to maintain 
and extend to new languages and domains. This 
holds especially in the Verbmobil context with its
distributed grammar development. 

Our approach, however, does not preclude the use
of interlingua predicates. We use interlingua rep-
resentations for time and date expressions in the
Verbmobil domain. Similarly for prepositions, cf.
Buschbeck-Wolf and Nübel (1995), it makes sense
to use more abstract relations which express fun- 
damental relationships like temporal location or 
spatial location. Then it is left to the language spe- 
cific grammars to make the right lexical choices. 

(11) a. type(de, temp_loc, [an, in, um, zu]).
     b. am Dienstag, im Mai, um drei, zu Ostern
     c. type(en, temp_loc, [on, in, at]).
     d. on Tuesday, in May, at three, at Easter

The class definitions in (11a) and (11c) cluster 
together those prepositions which can be used to 
express a temporal location. The names de and en 
are the SL and TL modules in which the class is 
defined, temp_loc is the class name and the list 
denotes the extension of the class. (11b) and (11d)
show possible German and English lexicalizations. 

(12) [temp_loc(E,X)], [sort(X)=<time] <->
     [temp_loc(E,X)].

The interlingua rule in (12) identifies the abstract 
temporal location predicates under the condition 
that the internal argument is more specific than 
the sort time. This condition is necessary be- 
cause of the polysemy of those prepositions. Dur- 
ing compilation the SL class definition will be au- 
tomatically expanded to the individual predicates, 
whereas the TL class definition will be kept unex- 
panded such that the target grammar might be 
able to choose one of the idiosyncratic preposi- 
tions. 
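
The asymmetric treatment of the class definitions in (11) and the sort condition in (12) can be sketched as follows. This is a hedged illustration only: the small sort hierarchy (`weekday`, `month`, `city`, `location`), the table layout and the function names are hypothetical and stand in for the external domain model, which the paper does not specify in detail.

```python
# Class definitions from (11a) and (11c), keyed by (module, class name).
TYPES = {
    ("de", "temp_loc"): ["an", "in", "um", "zu"],
    ("en", "temp_loc"): ["on", "in", "at"],
}

# Hypothetical sort hierarchy (child -> parent); a stand-in for the domain model.
SORT_PARENT = {"weekday": "time", "month": "time", "city": "location"}


def subsumed_by(sort, ancestor):
    """True if sort equals ancestor or lies below it in the hierarchy."""
    while sort is not None:
        if sort == ancestor:
            return True
        sort = SORT_PARENT.get(sort)
    return False


def expand_sl_class(module, cls):
    """Compile-time step: one concrete SL predicate per class member.

    The TL side keeps the unexpanded class name, so the target grammar
    remains free to pick the idiosyncratic preposition.
    """
    return [(member, cls) for member in TYPES[(module, cls)]]


def transfer_prep(pred, arg_sort, compiled):
    """Apply rule (12): map an SL preposition to the abstract TL relation."""
    for sl_pred, tl_class in compiled:
        if pred == sl_pred and subsumed_by(arg_sort, "time"):
            return tl_class
    return None


compiled = expand_sl_class("de", "temp_loc")
```

With these assumptions, `transfer_prep("an", "weekday", compiled)` yields the abstract relation `temp_loc`, whereas `transfer_prep("in", "city", compiled)` yields nothing, which reflects the role of the sort condition in filtering out the non-temporal readings of polysemous prepositions.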

Mixed approaches like Kaplan et al. (1989) can
be characterized by mapping syntax as well as
a predicate-argument structure (f-structure). As
already pointed out, e.g. in (Sadler and Thomp-
son, 1991), this kind of transfer has problems with
its own multiple level mappings, e.g. handling of 
verb-adverb head switching, and does not cleanly 
separate monolingual from contrastive knowledge, 
either. In Kaplan and Wedekind (1993) an im- 
proved treatment of head switching is presented 
but it still remains a less general solution. 

A semantic approach is much more indepen- 
dent of different syntactic analyses which are the 
source of a lot of classical translation problems 
such as structural and categorial divergences and 
mismatches. In our approach grammars can be de- 
veloped for each language independently of the 
transfer task and can therefore be reused in other 
applications. 

At first glance, our approach is very similar 
to the semantic transfer approach presented in 
Alshawi et al. (1991). It uses a level of underspec- 
ified semantic representations as input and output 
of transfer. The main differences between our ap- 
proach and theirs are the use of flat semantic rep- 
resentations and the non-recursive transfer rules. 
The set-oriented representation allows much sim- 
pler operations in transfer for accessing individual 
entities (set membership) and for combining the 
result of individual rules (set union). Furthermore, 
because the recursive rule application is not part 
of the rules themselves, our approach solves prob- 
lems with discontinuous translation equivalences 
which the former approach cannot handle well. A 
transfer rule for such a case is given in (4). 

Our current approach is strongly related to 
the Shake-and-Bake approach of Beaven (1992) 
and Whitelock (1992). But instead of using 
sets of lexical signs, i.e. morpho-syntactic lex- 
emes as in Shake-and-Bake, we specify trans- 
lation equivalences on sets of arbitrary seman- 
tic entities. Therefore, before entering the trans- 
fer component of our system, individual lex- 
emes can already be decomposed into sets of 
such entities, e.g. for stating generalizations on 
the lexical semantics level or providing suit- 
able representations for inferences. For example, 
the wh-question word when is decomposed into 
temp_loc(E,X), whq(X,R), time(R,X) (lit.: at
which time), hence no additional transfer rules are
required. Similarly, German compounds like Ter-
minvorschlag are decomposed into their components,
e.g. termin(i2), n_n(i1,i2), vorschlag(i1)
where n_n denotes a generic noun-noun relation.
As a result a compositional translation as proposal 
for a date is possible without stating any addi- 
tional translation equivalences to the ones for the 
simplex nouns. 

Another major difference is the addition of con- 
ditions which trigger and block the applicability of 
individual transfer rules. For instance in the spe- 
cific translation of schlecht to not good as defined 
in (3c), without conditions, one would have to add
the verb passen into the bag to test for such a
specific context. As a consequence the translation 
of the verb needs to be reduplicated, whereas in 
our approach, the translation of the verb can be 
kept totally independent of this specific transla- 
tion of the adverbial, because the condition func- 
tions merely as a test. 

This example also illustrates the usefulness
of labeled conditions, because the negation op-
erator can take such a label as an argument
and we can use unification again to achieve
the correct coindexation. If we used a hi-
erarchical semantics instead, as in the origi-
nal Shake-and-Bake approach, where the negation
operator embeds the verb semantics, we would
have to translate schlecht(e), passen(e) into
not(suit(e), well(e)) in one rule because
there is no coindexation possible to express the
correct embedding without the unique labeling of
predicates.

Finally, we have addressed the lack of an adequate
control strategy for Shake-and-Bake by develop-
ing a nonmonotonic control strategy which orders
more specific rules before less specific ones. This
strategy allows the specification of powerful de-
fault translations. Without such an ordering,
special care is needed to prevent a compo-
sitional translation in cases where a more specific
noncompositional translation also exists.

The same argument about control holds in com- 
parison to the unification-based transfer approach 
on Minimal Recursion Semantics (MRS) (Copes-
take et al., 1995; Copestake, 1995). In addition, we
use matching on first order terms instead of fea- 
ture structure unification. Full unification might 
be problematic because it is possible to add ar- 
bitrary information during rule application, e.g. 
by further unifying different arguments. The other 
main difference is our nonmonotonic control com- 
ponent whereas the MRS approach assumes a 
monotonic computation of all possible transfer 
equivalences which are then filtered by the gen- 
eration grammar. It is difficult to judge the feasi- 
bility of their approach given the fact that only a 
limited coverage has been addressed so far. 

5 Implementation 

A more detailed presentation of the implementa- 
tion aspects of our transfer approach can be found 
in Dorna and Emele (1996). The current transfer 
implementation consists of a transfer rule compiler 
which takes a set of rules like the one presented in 
section 3 and compiles them into two executable
Prolog programs, one for each translation direc-
tion. The compiled program includes the selection
of rules, the control of rule applications and calls 
to external processes if necessary. 

Because both the transfer input and the match-
ing part of the rules consist of sets we can ex-
ploit ordered set operations during compilation as
well as at runtime to speed up the matching pro-
cess and for computing common prefixes which are
shared between different rules.
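
One way to picture the use of ordered sets and shared prefixes is a trie over the sorted source-side predicates of the rules. The sketch below is an assumption about how such sharing could look, not a description of the compiled Prolog code: the trie layout, the alphabetical ordering and the function names are ours, and rules are reduced to their predicate names.

```python
def build_trie(rules):
    """Index rules by their SL predicates in a fixed (alphabetical) order.

    Rules whose ordered predicate lists start identically share a path.
    """
    root = {"rules": [], "children": {}}
    for name, preds in rules:
        node = root
        for pred in sorted(preds):
            node = node["children"].setdefault(
                pred, {"rules": [], "children": {}})
        node["rules"].append(name)
    return root


def candidates(trie, input_preds):
    """Names of all rules whose SL predicates form a subset of the input."""
    ordered = sorted(set(input_preds))
    found = []

    def walk(node, start):
        found.extend(node["rules"])
        # Both sides are ordered, so we only ever look to the right.
        for i in range(start, len(ordered)):
            child = node["children"].get(ordered[i])
            if child is not None:
                walk(child, i + 1)

    walk(trie, 0)
    return found


RULES = [
    ("3a", ["echt"]),
    ("3b", ["passen", "arg3", "bei"]),
    ("4", ["passen", "bei"]),
    ("6b", ["passen"]),
    ("8", ["machen", "arg3", "terminvorschlag"]),
]
TRIE = build_trie(RULES)
INPUT_2A = ["echt", "schlecht", "passen", "arg3", "pron", "bei", "ich"]
matching = sorted(candidates(TRIE, INPUT_2A))
```

In this toy index, rules (3b) and (8) share the prefix arg3, and a single left-to-right pass over the ordered input finds (3a), (3b), (4) and (6b) as candidates for (2a), while (8) is never reached because machen does not occur in the input.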

The compiled transfer program is embedded in 
the incremental and parallel architecture of the 
Verbmobil Prototype. Interaction with external
modules, e.g. the domain model and dialog mod- 
ule or other inference components, is done via a set 
of predefined abstract interface functions which 
may be called in the condition part of transfer 
rules. The result is a fully transparent and modu- 
lar interface for filtering the applicability of trans- 
fer rules. 

6 Summary 

This paper presents a new declarative transfer 
rule formalism, which provides an implementation 
platform for a semantic-based transfer approach. 
This approach combines ideas from a number of 
recent MT proposals and tries to avoid many of 
the well known problems of other transfer and in- 
terlingua approaches. 

The declarative transfer correspondences are 
compiled into an executable Prolog program. The 
compiler exploits indexing for more efficient search 
of matching rules. There is a nonmonotonic but 
rule-independent control strategy based on rule 
specificity. 

Currently, the transfer component contains 
about 1700 transfer rules. Thanks to the set ori- 
entation and indexing techniques we did not en- 
counter any scaling problems and the average run- 
time performance for a 15 word sentence is about 
30 milliseconds. 

Future work will include the automatic acqui- 
sition of transfer rules from tagged bilingual cor- 
pora to extend the coverage and an integration of 
domain specific dictionaries. 